Serial No.: 09/745,795 
IN THE CLAIMS:

1. (Currently Amended) A character string dividing system 
for segmenting a character string into a plurality of words, 
comprising: 

input means for receiving a document; 

document data storing means serving as a document database 
for storing a received document; 

character joint probability calculating means for 
calculating a character joint probability that represents a 
probability of two neighboring characters appearing immediately
next to each other in said document database; 

probability table storing means for storing a table of 
calculated character joint probabilities; 

character string dividing means for segmenting an objective 
character string into a plurality of words with reference to 
said table of calculated character joint probabilities; and 

output means for outputting a division result of said 
objective character string.

2. (Currently Amended) A character string dividing method
for segmenting a character string into a plurality of words, 
said method comprising the steps of:

statistically calculating a character joint probability
that represents a probability of two neighboring characters
appearing immediately next to each other in a given document
database; and

segmenting an objective character string into a plurality 
of words with reference to calculated character joint 
probabilities so that each division point of said objective 
character string is present between two neighboring characters 
having a smaller character joint probability. 
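
As an illustration only, the calculating and segmenting steps of claim 2 might be sketched in Python as follows. The estimator (count of the pair divided by count of the leading character), the function names, the toy corpus, the fixed cutoff of 0.6, and the treatment of unseen pairs as probability 0.0 are all assumptions made for this sketch; the claim does not prescribe any of them.

```python
from collections import Counter

def joint_probabilities(documents):
    """Estimate P(b immediately follows a) as count(ab) / count(a followed by anything)."""
    pair_counts = Counter()
    lead_counts = Counter()
    for doc in documents:
        for a, b in zip(doc, doc[1:]):
            pair_counts[(a, b)] += 1
            lead_counts[a] += 1
    return {pair: n / lead_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(text, table, cutoff):
    """Divide text between neighbors whose joint probability is below cutoff."""
    if not text:
        return []
    words = [text[0]]
    for a, b in zip(text, text[1:]):
        # Assumption: a pair never seen in the corpus gets probability 0.0.
        if table.get((a, b), 0.0) < cutoff:
            words.append(b)      # small probability: start a new word
        else:
            words[-1] += b       # large probability: keep the characters joined
    return words

corpus = ["abcd", "abxy", "cdab", "xyab"]   # hypothetical document database
table = joint_probabilities(corpus)
result = segment("abcdxy", table, cutoff=0.6)
print(result)
```

In this toy corpus the pairs "ab", "cd" and "xy" always occur together, so the division points fall between them.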

3. (Currently Amended) A character string dividing method 
for segmenting a character string into a plurality of words,
said method comprising the steps of:

statistically calculating a character joint probability
that represents a probability of two neighboring characters 
appearing immediately next to each other in a given document 
database, said character joint probability being calculated as 
an appearance probability of a specific character string
appearing immediately before a specific character, said specific 
character string including a former one of said two neighboring 
characters as a tail thereof and said specific character being a 
latter one of said two neighboring characters; and

segmenting an objective character string into a plurality 
of words with reference to calculated character joint
probabilities so that each division point of said objective 
character string is present between two neighboring characters 
having a smaller character joint probability. 

4. (Currently Amended) A character string dividing method
for segmenting a character string into a plurality of words,
said method comprising the steps of:

statistically calculating a character joint probability 
that represents a probability of two neighboring characters 
appearing immediately next to each other in a given document 
database, said character joint probability being calculated as 
an appearance probability of a first character string appearing 
immediately before a second character string, said first 
character string including a former one of said two neighboring
characters as a tail thereof and said second character string 
including a latter one of said two neighboring characters as a 
head thereof; and 

segmenting an objective character string into a plurality 
of words with reference to calculated character joint 
probabilities so that each division point of said objective 
character string is present between two neighboring characters 
having a smaller character joint probability. 

5. (Currently Amended) The character string dividing 
method in accordance with claim 4, wherein said character joint 
probability of two neighboring characters is calculated based on 
a first probability of said first character string appearing 
immediately before said latter one of said two neighboring 
characters and also based on a second probability of said second 
character string appearing immediately after said former one of 
said two neighboring characters.

6. (Currently Amended) A character string dividing method
for segmenting a character string into a plurality of words, 
said method comprising the steps of:

statistically calculating a character joint probability
that represents a probability of two neighboring characters
appearing immediately next to each other in a given document
database prepared for learning purpose; and 

segmenting an objective character string into a plurality 
of words with reference to calculated character joint 
probabilities so that each division point of said objective 
character string is present between two neighboring characters 
having a smaller character joint probability, 

wherein, when said objective character string involves a 
sequence of characters not involved in said document database, a 
character joint probability of any two neighboring characters 
not appearing in said database is estimated based on said 
calculated character joint probabilities for the neighboring 
characters stored in said document database.

7. (Currently Amended) The character string dividing
method in accordance with claim 2, wherein said division point
of said objective character string is determined based on a
comparison between the character joint probability and a
threshold, and said threshold is determined with reference to an 
average word length of resultant words. 
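
One possible reading of claim 7 is sketched below in Python: several trial thresholds are applied, and the one whose resultant words have an average length closest to a target is kept. The target length, the list of trial thresholds, the function names and the sample probabilities are hypothetical and serve only to make the sketch runnable.

```python
def split_at(text, probs, threshold):
    """probs[i] is the joint probability between text[i] and text[i + 1]."""
    words, start = [], 0
    for i, p in enumerate(probs):
        if p < threshold:                 # comparison with the threshold
            words.append(text[start:i + 1])
            start = i + 1
    words.append(text[start:])
    return words

def pick_threshold(text, probs, candidates, target_length):
    """Return the trial threshold whose average word length is closest to the target."""
    def gap(threshold):
        words = split_at(text, probs, threshold)
        return abs(len(text) / len(words) - target_length)
    return min(candidates, key=gap)

text = "abcdef"
probs = [0.9, 0.2, 0.8, 0.4, 0.7]         # hypothetical joint probabilities
best = pick_threshold(text, probs, candidates=[0.1, 0.3, 0.5, 0.95], target_length=2.0)
words = split_at(text, probs, best)
print(best, words)
```

A larger threshold produces more division points and therefore shorter words, which is why the average word length can be used to tune it.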

8. (Currently Amended) The character string dividing 
method in accordance with claim 2, wherein a changing point of 
character type is considered as a prospective division point of
said objective character string.

9. (Cancelled) 

10. (Currently Amended) A character string dividing system 
for segmenting a character string into a plurality of words,
comprising: 

input means for receiving a document; 

document data storing means serving as a document database 
for storing a received document;

character joint probability calculating means for
calculating a character joint probability that represents a
probability of two neighboring characters appearing immediately
next to each other in said document database;

probability table storing means for storing a table of 
calculated character joint probabilities; 

word dictionary storing means for storing a word dictionary 
prepared or produced beforehand; 

division pattern producing means for producing a plurality 
of candidates for a division pattern of an objective character 
string with reference to information of said word dictionary; 

correct pattern selecting means for selecting a correct 
division pattern from said plurality of candidates with 
reference to said table of character joint probabilities; and 

output means for outputting said selected correct division 
pattern as a division result of said objective character string.

11. (Currently Amended) A character string dividing method
for segmenting a character string into a plurality of words, 
said method comprising the steps of:

statistically calculating a character joint probability
that represents a probability of two neighboring characters
appearing immediately next to each other in a given document
database;

storing calculated character joint probabilities; and 
segmenting an objective character string into a plurality 
of words with reference to a word dictionary, 

wherein, when there are a plurality of candidates for a 
division pattern of said objective character string, a correct 
division pattern is selected from said plurality of candidates 
with reference to calculated character joint probabilities so 
that each division point of said objective character string is 
present between two neighboring characters having a smaller 
character joint probability.

12. (Currently Amended) The character string dividing
method in accordance with claim 11, wherein a score of each 
candidate is calculated when there are a plurality of candidates 
for a division pattern of said objective character string, 

said score is a sum of character joint probabilities at 
respective division points of said objective character string in 
accordance with a division pattern of said each candidate, and 

a candidate having the smallest score is selected as said 
correct division pattern. 
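
The sum-based selection of claim 12 might look like the following Python sketch. Each candidate is represented as a list of words, and the table maps a pair of neighboring characters to its joint probability. The representation, the function names, the sample values and the default of 0.0 for pairs missing from the table are assumptions of this sketch.

```python
def cut_probabilities(candidate, table):
    """Joint probabilities at each division point of a candidate (a list of words)."""
    return [table.get((left[-1], right[0]), 0.0)
            for left, right in zip(candidate, candidate[1:])]

def score_sum(candidate, table):
    """Score of a candidate: sum of joint probabilities at its division points."""
    return sum(cut_probabilities(candidate, table))

def select(candidates, table):
    """Pick the candidate with the smallest score."""
    return min(candidates, key=lambda c: score_sum(c, table))

# Hypothetical joint probabilities for the string "abcd".
table = {("a", "b"): 0.9, ("b", "c"): 0.2, ("c", "d"): 0.8}
candidates = [["ab", "cd"], ["a", "bcd"], ["abc", "d"]]
best = select(candidates, table)
print(best)
```

Here the candidate that divides between "b" and "c" wins because that pair has the smallest joint probability of the three division points considered.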

13. (Currently Amended) The character string dividing 
method in accordance with claim 11, wherein 

a score of each candidate is calculated when there are a 
plurality of candidates for a division pattern of said objective 
character string, 

said score is a product of character joint probabilities at 
respective division points of said objective character string in 
accordance with a division pattern of said each candidate, and 

a candidate having the smallest score is selected as said 
correct division pattern.

14. (Currently Amended) The character string dividing
method in accordance with claim 11, wherein 

a calculated character joint probability is given to each 
division point of said candidate; 

a constant value is assigned to each point between two 
characters not divided; 

a score of each candidate is calculated based on a sum of 
said character joint probability and said constant value thus 
assigned; and 

a candidate having the smallest score is selected as said 
correct division pattern. 
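
A minimal Python sketch of the scoring in claim 14 follows. Every point between two characters contributes to the score: the joint probability if the candidate divides there, a constant otherwise. The constant 0.5, the sample table, the list-of-words representation and the function name are assumptions made for illustration.

```python
def score_with_constant(candidate, table, constant):
    """Sum joint probabilities at division points and a constant at undivided points."""
    text = "".join(candidate)
    cuts = set()
    pos = 0
    for word in candidate[:-1]:
        pos += len(word)
        cuts.add(pos)                     # a division lies just before text[pos]
    total = 0.0
    for i in range(1, len(text)):
        if i in cuts:
            total += table.get((text[i - 1], text[i]), 0.0)
        else:
            total += constant
    return total

# Hypothetical joint probabilities for the string "abcd".
table = {("a", "b"): 0.9, ("b", "c"): 0.2, ("c", "d"): 0.8}
candidates = [["abcd"], ["ab", "cd"], ["a", "b", "cd"]]
best = min(candidates, key=lambda c: score_with_constant(c, table, constant=0.5))
print(best)
```

Because undivided points also cost something, a candidate with no division points no longer wins automatically, which makes candidates with different numbers of words comparable.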

15. (Currently Amended) The character string dividing 
method in accordance with claim 11, wherein a calculated 
character joint probability is given to each division point of 
said candidate; 

a constant value is assigned to each point between two 
characters not divided;

a score of each candidate is calculated based on a product
of said character joint probability and said constant value thus 
assigned; and 

a candidate having the smallest score is selected as said 
correct division pattern. 

16. (Currently Amended) A character string dividing system 
for segmenting a character string into a plurality of words, 
comprising: 

input means for receiving a document;

document data storing means serving as a document database 
for storing a received document; 

character joint probability calculating means for 
calculating a character joint probability that represents a 
probability of two neighboring characters appearing immediately 
next to each other in said document database; 

probability table storing means for storing a table of 
calculated character joint probabilities; 

word dictionary storing means for storing a word dictionary 
prepared or produced beforehand;

unknown word estimating means for estimating unknown words
not registered in said word dictionary; 

division pattern producing means for producing a plurality 
of candidates for a division pattern of an objective character 
string with reference to information of said word dictionary and 
said estimated unknown words; 

correct pattern selecting means for selecting a correct 
division pattern from said plurality of candidates with 
reference to said table of character joint probabilities; and 

output means for outputting said selected correct division 
pattern as a division result of said objective character string. 

17. (Currently Amended) A character string dividing method 
for segmenting a character string into a plurality of words, 
said method comprising the steps of:

statistically calculating a character joint probability
that represents a probability of two neighboring characters
appearing immediately next to each other in a given document
database;

storing calculated character joint probabilities; and 
segmenting an objective character string into a plurality
of words with reference to dictionary words and estimated 
unknown words, 

wherein, when there are a plurality of candidates for a 
division pattern of said objective character string, a correct 
division pattern is selected from said plurality of candidates 
with reference to calculated character joint probabilities so 
that each division point of said objective character string is 
present between two neighboring characters having a smaller 
character joint probability. 

18. (Original) The character string dividing method in 
accordance with claim 17, wherein it is checked if any word 
starts from a certain character position (i) when a preceding 
word ends at a character position (i-1) and, when no dictionary 
word starting from said character position (i) is present, 
appropriate character strings are added as unknown words 
starting from said character position (i), where said character
strings to be added have a character length not smaller than n 
and not larger than m, where n and m are positive integers. 
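
The check described in claim 18 could be sketched in Python as follows: at a position (i) where the preceding word has ended, dictionary words are looked up first, and only if none starts there are substrings of length n to m added as unknown words. The function name, the tuple representation and the sample dictionary are hypothetical.

```python
def words_starting_at(text, i, dictionary, n, m):
    """Dictionary words starting at position i, or unknown-word candidates
    of length n..m when no dictionary word starts there."""
    known = [w for w in sorted(dictionary) if text.startswith(w, i)]
    if known:
        return [(w, "dictionary") for w in known]
    # No dictionary word starts at i: add substrings of length n..m as unknown words.
    return [(text[i:i + k], "unknown")
            for k in range(n, m + 1) if i + k <= len(text)]

dictionary = {"ab", "abc"}                # hypothetical word dictionary
text = "abcxyz"
at_start = words_starting_at(text, 0, dictionary, n=1, m=2)
at_gap = words_starting_at(text, 3, dictionary, n=1, m=2)
print(at_start, at_gap)
```

Unknown-word candidates that would run past the end of the string are simply not generated.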

19. (Original) The character string dividing method in
accordance with claim 17, wherein 

a constant value given to said unknown word is larger than 
a constant value given to said dictionary word, 

a score of each candidate is calculated based on a sum of 
said constant values given to said unknown word and said 
dictionary word in addition to a sum of calculated joint 
probabilities at respective division points, and 

a candidate having the smallest score is selected as said 
correct division pattern. 

20. (Original) The character string dividing method in 
accordance with claim 17, wherein 

a constant value given to said unknown word is larger than 
a constant value given to said dictionary word, 

a score of each candidate is calculated based on a product 
of said constant values given to said unknown word and said 
dictionary word in addition to a product of calculated joint 
probabilities at respective division points, and

a candidate having the smallest score is selected as said
correct division pattern. 